A Linear-dependence-based Approach to Design Proactive Credit Scoring Models

Authors

  • Roberto Saia
  • Salvatore Carta
Abstract

The main aim of a credit scoring model is to classify loan customers into two classes, reliable and unreliable, on the basis of their potential capability to keep up with their repayments. Credit scoring models are increasingly in demand because of the growth in consumer credit. Such models are usually built from past loan applications and used to evaluate new ones. Defining them is challenging for several reasons, the most important of which is the imbalanced class distribution of the data (i.e., the number of default cases is much smaller than the number of non-default cases), which reduces the effectiveness of the most widely used approaches (e.g., neural networks, random forests, and so on). The Linear Dependence Based (LDB) approach proposed in this paper offers a twofold advantage. First, it evaluates a new loan application on the basis of the linear dependence of its vector representation with respect to a matrix composed of the vector representations of past non-default applications; because it uses only one class of data, it overcomes the imbalanced class distribution issue. Second, it does not use the defaulting loans, which allows it to operate proactively and also addresses the cold-start problem. We validate our approach on two real-world datasets characterized by a strongly unbalanced data distribution, comparing its performance with that of one of the best state-of-the-art approaches: random forests.
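The abstract does not spell out how linear dependence is measured, so the following is only an illustrative sketch and not the authors' LDB algorithm. It assumes one simple proxy: two vectors are linearly dependent exactly when the absolute cosine of the angle between them is 1, so the mean absolute cosine between a new application and each past non-default application can serve as a dependence score. The function names, the example feature vectors, and the `threshold` value are all hypothetical.

```python
import numpy as np


def dependence_score(history: np.ndarray, candidate: np.ndarray) -> float:
    """Mean |cosine| between `candidate` and every row of `history`.

    Assumption (not from the paper): pairwise linear dependence is
    approximated by |cos(angle)|, which equals 1 only for parallel vectors.
    """
    history = np.asarray(history, dtype=float)
    candidate = np.asarray(candidate, dtype=float)
    row_norms = np.linalg.norm(history, axis=1)
    cand_norm = np.linalg.norm(candidate)
    if cand_norm == 0 or np.any(row_norms == 0):
        raise ValueError("zero vectors have no defined direction")
    cosines = (history @ candidate) / (row_norms * cand_norm)
    return float(np.mean(np.abs(cosines)))


def classify(history: np.ndarray, candidate: np.ndarray,
             threshold: float = 0.9) -> str:
    """Label a new application using only non-default history.

    `threshold` is a hypothetical cut-off that would need tuning on data.
    """
    score = dependence_score(history, candidate)
    return "reliable" if score >= threshold else "unreliable"


# Toy non-default history: each row is one past application's feature vector.
non_default = np.array([[1.0, 2.0, 3.0],
                        [2.0, 4.0, 6.0]])

parallel = np.array([3.0, 6.0, 9.0])     # same direction as the history rows
orthogonal = np.array([3.0, 0.0, -1.0])  # perpendicular to the history rows

label_parallel = classify(non_default, parallel)
label_orthogonal = classify(non_default, orthogonal)
```

Note that only the non-default class appears in `non_default`; no defaulting loans are needed to score a new application, which mirrors the one-class, proactive setting the abstract describes.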


Similar resources

Investigating the missing data effect on credit scoring rule based models: The case of an Iranian bank

Credit risk management is a process in which banks estimate the probability of default (PD) for each loan applicant. Data sets of previous loan applicants are built by gathering their data; these internal data sets are usually completed with external credit bureau data and finally used to estimate PD. There is also continuing interest among banks in using rule-based classifiers to b...


Using the Hybrid Model for Credit Scoring (Case Study: Credit Clients of microloans, Bank Refah-Kargeran of Zanjan, Iran)

In any country, commercial banks lay the groundwork for economic growth by collecting national resources and capitals and allocating them to different economic sectors. Optimal allocation of resources is especially important in achieving this goal. Banks with an effective and dynamic system of customer assessment can efficiently allocate their resources to customers regardless of their geograph...


Dependence of Default Probability and Recovery Rate in Structural Credit Risk Models: Empirical Evidence from Greece

The main idea of this paper is to study the dependence between the probability of default (PD) and the recovery rate (RR) in a credit portfolio and to investigate this relationship empirically. We examine the dependence between PD and RR with a theoretical approach. For the empirical methodology, we use bootstrapped quantile regression and simultaneous quantile regression. These methods allow to determinate...


Using DEA for Classification in Credit Scoring

Credit scoring is a binary classification problem whose results carry important information for managers making decisions, particularly in banking. The resulting scores give a loan officer a practical basis for deciding whether to accept or reject a client's loan application. To this end, in this paper a data envelopment analysis-discriminant analysis (DEA-DA) approach is us...


Adaptive Credit Scoring with Kernel Learning Methods - Abstract

Credit scoring is a method of modelling the potential risk of credit applications. Traditionally, logistic regression, linear regression and discriminant analysis are the most popular approaches for building credit scoring models. Despite their popularity, these methods have several known limitations, such as being unstable with high-dimensional data (also known as combinat...



Publication date: 2016